Corpus: epo_wikipedia_2012

3.7.3 Distribution of the string similarity for different rank ranges

Distribution of the Levenshtein distance for words in different rank ranges:

String similarity for the top 1,000 words

Distance  Percentage of words
0         27.6074
1         49.6933
2         22.6994

String similarity for the top 10,000 words

Distance  Percentage of words
0          6.6457
1         44.4737
2         48.8806

String similarity for the top 100,000 words

Distance  Percentage of words
0          3.3017
1         30.5508
2         66.1476

String similarity for the top 1,000,000 words

Distance  Percentage of words
0          3.2870
1         30.3582
2         66.3547
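
A distribution of this kind could be computed roughly as in the sketch below. The page does not say how a word's distance is defined, so the sketch assumes that each word is assigned the Levenshtein distance to its nearest neighbour in the same ranked list, and that words with no neighbour within distance 2 are left out. Under that assumption a distance of 0 can only arise from duplicate entries, so the sketch does not claim to reproduce the distance-0 rows above. The names `levenshtein` and `distance_distribution` are hypothetical and not taken from the corpus tooling.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion and substitution."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def distance_distribution(words, max_distance=2):
    """Percentage of words whose nearest neighbour lies at each distance.

    Assumption (not stated in the source): every word is compared with all
    other entries in `words`, and words without a neighbour within
    `max_distance` are excluded from the percentages.
    """
    counts = Counter()
    for i, word in enumerate(words):
        nearest = min(
            (levenshtein(word, other)
             for j, other in enumerate(words) if j != i),
            default=None,
        )
        if nearest is not None and nearest <= max_distance:
            counts[nearest] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {d: 100.0 * counts[d] / total for d in sorted(counts)}
```

Calling `distance_distribution` on the first N entries of a frequency-ranked word list would give one table per rank range. The all-pairs comparison is quadratic in the list length, so this form is only practical for small lists; the larger ranges would need an index or a pruning strategy.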